ABSTRACT

The focus of this study is the estimation
procedures implemented in BILOG, a computer program. One purpose is
to compare the item parameter estimates produced by various
procedures available in BILOG. Four different models are used: the
one, two, and three parameter models and a three parameter model with
common guessing parameters. The results generally indicate that the
various item parameter estimation procedures tend to yield similar
results. The major exception concerned the Bayesian and maximum
likelihood procedures (MLP) applied to the three parameter model. The
MLP has a tendency to produce more extreme estimates than the
Bayesian procedure. A second purpose is to compare the ability
estimates produced by the available procedures: maximum likelihood,
expected a posteriori, and maximum a posteriori. The results
indicated that: (1) robustification does not have a strong effect on
the mean or standard deviation of the ability estimates; (2) the mean
and variance of the ability estimates are not strongly affected by the
type of item parameter estimates used in calculating ability
estimates; and (3) the effect of ability estimation procedure is
fairly strong on the ability estimates. (PN)

A Comparison of Item Parameter Estimates and of Ability Parameter
Estimates Obtained By Different Methods Implemented by BILOG



A binary item response theory (IRT) model is a model for the 
relationship between binary scores on a test item and scores on a 
latent (or unobservable) trait. The curve expressing the
relationship is called an item characteristic curve (ICC). The 
most popular binary IRT models are the normal ogive model and the 
one, two, and three parameter logistic models. The development 
of procedures for estimating the parameters of binary IRT models 
has a history dating back about 50 years. As Baker (1977) notes,
initial attempts to solve the estimation problem generally
involved substituting an observed score, usually a total score on
the test, for the latent trait and estimating the item parameters
of each ICC independently. Finney (1944) presented this kind of
maximum likelihood estimation procedure for estimating parameters
of the normal ogive model. Earlier Richardson (1936), Ferguson
(1942), and Lawley (1943) had applied the constant process, a
generalized least squares procedure, to the estimation task. The
maximum likelihood and constant process approaches typically yield
similar estimates (Baker, 1965) though the former can encounter 
problems when a score group has a large proportion of examinees 
answering correctly or incorrectly (Finney, 1944). Both 
approaches can be applied to the logistic models. 

Another approach to estimating item parameters of the normal
ogive is based on Richardson's (1936) demonstration of the
functional relationship between the item parameters of the normal
ogive model and the item-latent trait correlation and item
difficulty of classical test theory. When the item-observed
score correlation is substituted for the item-latent trait
correlation, an approximate procedure is obtained for estimating
the normal ogive parameters. More recently, Urry (1974) extended
this procedure to the three parameter normal ogive model.
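
The functional relationship can be sketched as follows. The formulas used here are the standard textbook ones for the two parameter normal ogive with a normally distributed latent trait; they are not quoted from this paper. The discrimination is a = r / sqrt(1 - r^2) and the difficulty is b = g / r, where r is the item-latent trait biserial correlation and g is the normal deviate with upper-tail area equal to the proportion correct. The function name and the example values are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def normal_ogive_from_classical(p_correct, biserial):
    """Approximate two parameter normal ogive (a, b) for one item.

    p_correct: proportion answering correctly (classical difficulty).
    biserial:  item-latent trait biserial correlation; the approximate
               procedure substitutes the item-observed score biserial.
    """
    if not 0.0 < p_correct < 1.0:
        raise ValueError("p_correct must lie strictly between 0 and 1")
    if not 0.0 < abs(biserial) < 1.0:
        raise ValueError("biserial must be nonzero and smaller than 1 in size")
    # Normal deviate whose upper-tail area equals the proportion correct.
    gamma = -NormalDist().inv_cdf(p_correct)
    a = biserial / sqrt(1.0 - biserial ** 2)
    b = gamma / biserial
    return a, b


# Hypothetical easy item: 80% correct, biserial of .50.
a, b = normal_ogive_from_classical(p_correct=0.80, biserial=0.50)
print(round(a, 3), round(b, 3))
```

An easy item (proportion correct above .5) with a positive biserial yields a negative difficulty, which matches the negative mean b estimates reported for this relatively easy test.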

In recent years, three new maximum likelihood procedures 
have become available. These procedures, which do not require 
substituting an observed score for the latent trait score, are 
the conditional (CML), joint (JML), and marginal maximum
likelihood (MML) procedures. All three can be applied to the one 
parameter logistic (Rasch) model. The latter two can be applied 
to the two and three parameter logistic models. There are 
computer programs available to implement each of the CML, JML, 
and MML procedures with the logistic models. The PML 
(Gustafsson, 1977) program implements the CML procedure for the 
one parameter logistic model. BICAL (Wright, Mead and Bell, 
1979) implements the JML procedure for the Rasch model. LOGIST 5 
(Wingersky, Barton, and Lord, 1982) calculates JML estimates for 
all three logistic models, whereas BILOG (Mislevy & Bock, 1982)
implements the MML procedure for the three logistic models. Swaminathan
(1983) gave a detailed presentation of the three types of 
estimators. 

CML and MML item-parameter estimators are consistent
estimators. This may be a significant advantage over the JML
estimators, which are inconsistent when the number of items is
finite. However, for the one parameter logistic model the JML
estimators have been shown to be consistent as the number of
items and examinees each tend to infinity (Haberman, 1975).
Empirical results (Lord, 1975; Swaminathan and Gifford, 1983)
suggest this result may hold for the two and three parameter
logistic models.

There has also been some interest in Bayesian estimation of 
the item parameters of the logistic models. Swaminathan and 
Gifford (1982) developed a Bayesian procedure for use with the 
one parameter logistic model. The Bayesian procedure has been 
extended to the two parameter logistic model by Swaminathan and 
Gifford (1985). BILOG implements a Bayesian procedure for all 
three models. However, it differs in several details from the
Swaminathan-Gifford procedure.

Just as there are several procedures available for 
estimating the item parameters, there are several for estimating 
the ability parameters. The name of the JML procedure derives 
from the fact that it simultaneously estimates the item and 
ability parameters. Thus there are JML estimators of ability 
parameters. Similarly the Swaminathan-Gifford Bayesian
procedures simultaneously estimate both the item and ability
parameters. Other ability estimation procedures assume the item
parameters are known. Both a maximum likelihood procedure and a
variety of Bayesian procedures are available. BILOG incorporates
both kinds of procedures. In practice, of course, these
procedures are implemented using estimates of the item
parameters.

Purpose of the Study
The focus of this study is on the estimation procedures
implemented in BILOG. One purpose is to compare the item
parameter estimates produced by various procedures available in
BILOG. Four different models were used: the one, two, and three
parameter models and a three parameter model with common guessing
parameters. For item parameter estimation, BILOG basically
implements the MML and Bayesian procedures. However, the options
available in the program give the user a fairly wide set of
choices about the implementation of the procedures. These will
be described in the succeeding section.

A second purpose is to compare the ability estimates 
produced by the available procedures: maximum likelihood, 
expected a posteriori, and maximum a posteriori. The latter
two are Bayesian procedures. For each of the three procedures
biweight robustification is the only available option. The
effect of robustification was investigated.



Marginal Maximum Likelihood Procedure 
In this section we describe the MML procedure in the context
of the three parameter logistic model. Let $P_i(\theta_j)$ denote the
probability that the jth examinee (j = 1, ..., N) answers the ith
item correctly (i = 1, ..., n). Let $u_{ij}$ be a binary variable, with
$u_{ij} = 1$ for a correct response and $u_{ij} = 0$ for an incorrect
response. Let $\theta' = [\theta_1 \ldots \theta_N]$ be the vector of
latent trait scores, and let $a' = [a_1 \ldots a_n]$,
$b' = [b_1 \ldots b_n]$, and $c' = [c_1 \ldots c_n]$ be vectors of item
discrimination, difficulty, and guessing parameters respectively.
For the jth examinee the likelihood function for the data is

$$L(u_j \mid \theta_j, a, b, c) = \prod_i [P_i(\theta_j)]^{u_{ij}} [1 - P_i(\theta_j)]^{1 - u_{ij}}$$

where $u_j' = [u_{1j} \ldots u_{nj}]$. The notation emphasizes that the
likelihood function is conditioned on the jth ability parameter
and the item parameters for all items. For all N examinees the
likelihood function is

$$L(u \mid \theta, a, b, c) = \prod_j L(u_j \mid \theta_j, a, b, c)$$

where $u' = [u_1' \ldots u_N']$. The JML procedure simultaneously computes
the $\theta$, a, b, and c that maximize the latter likelihood function.
Thus N + 3n parameters are estimated for the three parameter model.
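
The conditional likelihood for one examinee can be sketched in a few lines. The text does not write out the form of the item characteristic curve, so this sketch assumes the usual three parameter logistic function, P = c + (1 - c) / (1 + exp(-D a (theta - b))), with scaling constant D = 1.7. The constant, the function names, and the example item values are assumptions made for illustration and are not taken from BILOG.

```python
from math import exp, log

D = 1.7  # assumed scaling constant; not stated in the text


def p_3pl(theta, a, b, c):
    """Assumed three parameter logistic item characteristic curve."""
    return c + (1.0 - c) / (1.0 + exp(-D * a * (theta - b)))


def log_likelihood(responses, theta, items):
    """Log of the conditional likelihood for one examinee.

    responses: list of 0/1 item scores for the examinee.
    items:     list of (a, b, c) tuples, one per item.
    """
    total = 0.0
    for u, (a, b, c) in zip(responses, items):
        p = p_3pl(theta, a, b, c)
        total += log(p) if u == 1 else log(1.0 - p)
    return total


def n_jml_parameters(n_examinees, n_items):
    """One ability per examinee plus a, b, and c for every item."""
    return n_examinees + 3 * n_items


# Hypothetical two-item example.
items = [(1.0, 0.0, 0.2), (1.5, -1.0, 0.2)]
print(log_likelihood([1, 0], 0.0, items))
print(n_jml_parameters(1000, 39))
```

With 1000 examinees and 39 items, the joint procedure would estimate 1117 parameters for the three parameter model, which illustrates why the number of parameters grows with the sample.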

In the MML procedure each examinee's latent trait score ($\theta$)
is considered to be randomly chosen from a population with
ability distribution $f(\theta)$. The marginal likelihood of the data
for the jth examinee is

$$L(u_j \mid a, b, c) = \int L(u_j \mid \theta, a, b, c) f(\theta) \, d\theta$$

Essentially, the marginal likelihood is obtained as a weighted
average of the conditional likelihoods

$$L(u_j \mid \theta, a, b, c)$$

where the weights are determined by $f(\theta)$. This weighting process
removes the dependence on $\theta$ and, therefore, in the process of
estimating item parameters, it is not necessary to estimate the
ability parameters for the N examinees. The function maximized
in the MML procedure is

$$L(u \mid a, b, c) = \prod_j L(u_j \mid a, b, c)$$
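
The weighted-average interpretation of the marginal likelihood can be made concrete with a simple numerical approximation. The sketch below replaces the integral with a sum over equally spaced points whose weights are proportional to the standard normal density. The three parameter logistic form, the constant D = 1.7, the number of points, and the function names are assumptions for illustration; this is not a description of the quadrature BILOG actually uses.

```python
from math import exp

D = 1.7  # assumed scaling constant


def p_3pl(theta, a, b, c):
    """Assumed three parameter logistic item characteristic curve."""
    return c + (1.0 - c) / (1.0 + exp(-D * a * (theta - b)))


def conditional_likelihood(responses, theta, items):
    """Likelihood of one response vector given theta and item parameters."""
    like = 1.0
    for u, (a, b, c) in zip(responses, items):
        p = p_3pl(theta, a, b, c)
        like *= p if u == 1 else 1.0 - p
    return like


def normal_grid(n_points=41, lo=-4.0, hi=4.0):
    """Equally spaced nodes with weights proportional to the N(0, 1) density."""
    step = (hi - lo) / (n_points - 1)
    nodes = [lo + k * step for k in range(n_points)]
    density = [exp(-0.5 * t * t) for t in nodes]
    total = sum(density)
    return nodes, [d / total for d in density]


def marginal_likelihood(responses, items, nodes, weights):
    """Weighted average of conditional likelihoods over the ability grid."""
    return sum(w * conditional_likelihood(responses, t, items)
               for t, w in zip(nodes, weights))


nodes, weights = normal_grid()
items = [(1.0, 0.0, 0.2), (1.5, -1.0, 0.2)]  # hypothetical (a, b, c) values
print(marginal_likelihood([1, 0], items, nodes, weights))
```

Because the ability value is averaged out, the result depends only on the item parameters and the assumed ability distribution.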

To implement the MML procedure it is necessary to make an
assumption about the form of $f(\theta)$. In BILOG the default option
is for $f(\theta)$ to be a standard normal distribution. However, the
program permits the user to specify other distributions.

In BILOG, the distribution $f(\theta)$ can be treated in either of
two ways. In one, $f(\theta)$ is treated as a distribution to be
estimated. Thus the assumed $f(\theta)$ is the basis for starting
values in an iterative procedure for estimating $f(\theta)$ and the item
parameters. As Mislevy and Bock (1982) note, this type of
procedure is similar to the JML procedure. We refer to it as
marginal maximum likelihood with estimation of ability
distribution (MML-EAD). In the other, $f(\theta)$ is treated as an
assumption about the distribution of latent ability. The same
distribution is employed throughout the iterative procedure for
estimating the item parameters. This is the MML procedure. Both
options were investigated in the study. We investigated three
forms for $f(\theta)$. Two were a normal distribution and a uniform
distribution, each with mean zero and standard deviation one.
For the third, we used BILOG to estimate $f(\theta)$ on one sample and
then used this estimate of $f(\theta)$ in applying the MML procedure to
a second sample. Both samples were chosen randomly from a larger
sample.
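
A uniform distribution with mean zero and standard deviation one spans the interval from -sqrt(3) to +sqrt(3), since the variance of a uniform distribution is its squared width divided by 12. The sketch below builds a discrete version on bin midpoints and checks its moments. The number of points and the midpoint rule are assumptions for illustration, and the way BILOG discretizes a user-specified distribution may differ.

```python
from math import sqrt


def uniform_grid(n_points=41, sd=1.0):
    """Equal weights on bin midpoints of a uniform with mean 0 and given SD."""
    half_width = sqrt(3.0) * sd
    bin_width = 2.0 * half_width / n_points
    nodes = [-half_width + (k + 0.5) * bin_width for k in range(n_points)]
    weights = [1.0 / n_points] * n_points
    return nodes, weights


def grid_mean_sd(nodes, weights):
    """Mean and standard deviation of a discrete distribution."""
    mean = sum(w * t for t, w in zip(nodes, weights))
    var = sum(w * (t - mean) ** 2 for t, w in zip(nodes, weights))
    return mean, sqrt(var)


nodes, weights = uniform_grid()
print(grid_mean_sd(nodes, weights))
```

The discrete standard deviation falls slightly below one because the midpoints do not reach the interval endpoints; the gap shrinks as the number of points grows.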

Bayesian Procedures 

The Bayesian procedures incorporate assumptions about the
distributions of item parameters. These assumed distributions
are called prior distributions. The default prior distributions
employed in BILOG are: normal, with mean zero and standard
deviation two, for the difficulty parameters; lognormal, with
mean $e^{.5}$ and variance $e^{1}(e^{1} - 1)$, for the discrimination
parameters; and beta, with $\alpha = 20p + 1$ and $\beta = 20(1 - p) + 1$,
for the guessing parameters. For the beta
distribution, p is the reciprocal of the number of alternatives.
The incorporation of the prior distributions into the
estimation procedure makes it unlikely for the estimates to occur
in regions that are less probable according to the prior
distribution. For example, the default prior distribution for
difficulty parameters is a normal distribution with mean zero and
standard deviation 2. Thus difficulty estimates <-2 or >2 are
substantially less likely than estimates between -2 and 2.
Difficulty estimates <-4 or >4 are very unlikely to occur. The
default prior distributions in BILOG are relatively diffuse.
That is, they do not constrain the estimates to unreasonably
small regions of the parameter space. However, BILOG permits the
user to tailor the priors to the specific application. Thus the
user can use more diffuse priors or tighter priors. In addition
the user can also choose which parameters to place priors on. In
the present study we implemented the Bayesian procedures by
employing default priors on all item parameters.
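
The strength of the default priors for the difficulty and guessing parameters can be checked numerically. The sketch below computes the prior probability that a difficulty parameter falls outside a given limit under a normal prior with mean zero and standard deviation two, and the parameters, mean, and mode of the beta prior for a given number of alternatives. The function names and the choice of four alternatives are illustrative assumptions.

```python
from math import erf, sqrt


def normal_tail_outside(limit, sd):
    """P(|X| > limit) for X normal with mean zero and standard deviation sd."""
    return 1.0 - erf(limit / (sd * sqrt(2.0)))


def beta_prior_for_guessing(n_alternatives):
    """Beta prior parameters, mean, and mode when p = 1 / n_alternatives."""
    p = 1.0 / n_alternatives
    alpha = 20.0 * p + 1.0
    beta = 20.0 * (1.0 - p) + 1.0
    mean = alpha / (alpha + beta)
    mode = (alpha - 1.0) / (alpha + beta - 2.0)
    return alpha, beta, mean, mode


print(round(normal_tail_outside(2.0, 2.0), 4))  # prior mass beyond +/- 2
print(round(normal_tail_outside(4.0, 2.0), 4))  # prior mass beyond +/- 4
print(beta_prior_for_guessing(4))               # hypothetical 4-choice items
```

Under these definitions the mode of the beta prior equals p, the chance-level success rate, so the prior pulls the guessing estimates toward the reciprocal of the number of alternatives.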

When using the Bayesian procedure in BILOG the user can
either specify that the priors remain the same at each iteration
or that the parameters of the priors be updated on each
iteration. We refer to the former as Bayesian estimation (BE)
and the latter as Bayesian estimation with updating of item
priors (BE-UIP). For the kth iteration, the updating of the
prior distribution of the item difficulties, for example,
consists of substituting the mean of the item difficulty
estimates from the (k-1)th iteration for the mean of the assumed
prior distribution (which equals zero in the default prior for
item difficulties). The updating of the other priors also
involves substitution of the appropriate mean parameter estimate
from the (k-1)th iteration. When we employed prior
distributions, we investigated the effect of updating on the
parameter estimates.
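
The difference between the two options can be sketched as a rule for the prior mean in force at each iteration. The sketch below covers only the prior mean for the item difficulties and takes the estimates from each iteration as given; the function names and the example values are hypothetical, and the sketch does not reproduce BILOG's estimation step.

```python
def mean(values):
    """Arithmetic mean of a list of estimates."""
    return sum(values) / len(values)


def prior_means_by_iteration(estimates_by_iteration, default_mean=0.0,
                             update=True):
    """Prior mean used at iterations 1, 2, ..., K.

    estimates_by_iteration: list of K lists of difficulty estimates, one
        list per iteration (illustrative input).
    update=False mimics BE (prior fixed); update=True mimics BE-UIP, where
        iteration k uses the mean of the estimates from iteration k-1.
    """
    means = [default_mean]  # the first iteration always uses the default
    for previous in estimates_by_iteration[:-1]:
        means.append(mean(previous) if update else default_mean)
    return means


history = [[-1.0, -2.0], [-1.5, -2.5], [-1.6, -2.6]]  # hypothetical values
print(prior_means_by_iteration(history, update=True))
print(prior_means_by_iteration(history, update=False))
```

For an easy test, updating moves the prior mean for the difficulties below zero, so the estimates are shrunk toward their own average rather than toward zero.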



Other Options for Item Parameter Estimation 



In addition to the options described above, with the three 
parameter model and the three parameter model with common 
guessing parameters there are three options for treatment of 
omitted items. Omitted items can be treated as incorrect, not 
presented, or fractionally correct. Mislevy and Bock (1984) 
pointed out that the second option permits an examinee to obtain 
high scores by responding only to items the examinee is sure of.
This option would seem to favor the extremely cautious examinee
and it was not investigated.
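
The three treatments of omitted items can be sketched as a scoring rule. The text does not give the fraction used for fractionally correct scoring, so this sketch assumes it equals the reciprocal of the number of alternatives; that value, the function name, and the rule labels are assumptions made for illustration.

```python
def score_response(raw, n_alternatives, omit_rule="incorrect"):
    """Score one item response.

    raw: 1 for correct, 0 for incorrect, None for an omitted item.
    omit_rule: "incorrect", "fractional", or "not_presented".
    Returns a score, or None when the item is treated as not presented.
    """
    if raw is not None:
        return float(raw)
    if omit_rule == "incorrect":
        return 0.0
    if omit_rule == "fractional":
        # Assumed fraction: chance success rate on a multiple-choice item.
        return 1.0 / n_alternatives
    if omit_rule == "not_presented":
        return None
    raise ValueError("unknown omit_rule: " + repr(omit_rule))


pattern = [1, None, 0, 1]  # hypothetical responses with one omit
print([score_response(r, 4, "fractional") for r in pattern])
```

Under the not-presented rule an omitted item simply drops out of the examinee's likelihood, which is why an examinee who answers only items the examinee is sure of can obtain a high score.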

Method 

Instrument 

The test used in the study had 39 items and was relatively 
easy. The mean and standard deviation, for number correct 
scores, were 31.3 and 7.8. Frequency distributions for number 
correct scores, proportion-correct item difficulties, and 
item-total point biserials are displayed in Table 1.

Design

The levels of the factors in the design for investigating 
parameter estimation procedures were: 

1. Model: one parameter, two parameter, three parameter, and
three parameter with common guessing parameters;

2. Sample size: 250, 500, 750, and 1000 examinees;

3. Ability distributions: normal, uniform, and empirical;

4. Estimation procedures: MML, MML-EAD, BE, BE-UIP;

5. Scoring of omits: incorrect and fractionally correct for
the three parameter and three parameter common c models.
For the one and two parameter models, only incorrect
scoring of omits is implemented in BILOG.

Not all possible condition combinations were investigated. In
particular the effect of ability distribution was not
investigated with samples of 500 or 750. With samples of 1000,
the effect of ability distribution was only investigated in
connection with the three parameter model and the three parameter
common c model.

Three methods of estimating ability parameters were
investigated: maximum likelihood (ML), maximum a posteriori
(MAP), and expected a posteriori (EAP). In addition the effect
of biweight robustification was investigated. To implement each
ability estimation procedure, a set of item parameter estimates
is required. MML and BE estimates, obtained using both normal
and uniform ability distributions, were employed. This choice of
item parameter estimates was based on the results of the
comparison of item parameter estimates. Only the three parameter
model was employed, and only item parameter estimates based on
samples of 250 were employed. Again these decisions were based
on the comparisons of the item parameter estimates.
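
The three ability estimators can be contrasted on a grid of ability values. The sketch below treats the item parameters as known, assumes the three parameter logistic form with D = 1.7 and a normal prior with mean zero, and searches a bounded grid. The grid limits, item values, and function names are assumptions for illustration; BILOG's own algorithms and its biweight robustification are not reproduced here.

```python
from math import exp, log

D = 1.7  # assumed scaling constant


def p_3pl(theta, a, b, c):
    """Assumed three parameter logistic item characteristic curve."""
    return c + (1.0 - c) / (1.0 + exp(-D * a * (theta - b)))


def log_likelihood(responses, theta, items):
    """Log likelihood of one response vector at a given ability."""
    total = 0.0
    for u, (a, b, c) in zip(responses, items):
        p = p_3pl(theta, a, b, c)
        total += log(p) if u == 1 else log(1.0 - p)
    return total


def ability_estimates(responses, items, prior_sd=1.0,
                      lo=-6.0, hi=6.0, n_points=241):
    """Grid-based ML, MAP, and EAP estimates for one examinee."""
    step = (hi - lo) / (n_points - 1)
    nodes = [lo + k * step for k in range(n_points)]
    loglik = [log_likelihood(responses, t, items) for t in nodes]
    # Log posterior up to a constant: log likelihood plus log normal prior.
    logpost = [ll - 0.5 * (t / prior_sd) ** 2 for ll, t in zip(loglik, nodes)]
    ml = nodes[max(range(n_points), key=loglik.__getitem__)]
    map_ = nodes[max(range(n_points), key=logpost.__getitem__)]
    peak = max(logpost)
    post = [exp(lp - peak) for lp in logpost]
    eap = sum(t * w for t, w in zip(nodes, post)) / sum(post)
    return ml, map_, eap


items = [(1.0, -1.0, 0.2), (1.2, 0.0, 0.2), (0.8, 1.0, 0.2)]  # hypothetical
print(ability_estimates([1, 1, 1], items))  # perfect response pattern
```

For a perfect response pattern the likelihood keeps increasing with ability, so the ML estimate runs to the upper limit of the grid, whereas the prior keeps the MAP and EAP estimates finite. This is one way the ML procedure can produce much higher maximum estimates than the two Bayesian procedures.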



Insert Table 1 About Here 



Results 

Ability Distribution 

A normal, an empirical, and a uniform ability distribution 
were employed in the MML procedure to obtain three sets of 
parameter estimates for the three parameter model. For a 
particular sample size, the same sample was used with each prior 
distribution. Means and standard deviations for each set of 
estimates, based on a sample of 250, are reported in Table 2. 
Also reported are correlations between the a's, between the
b's, and between the c's estimated using the three ability
distributions. The results indicate that the estimates based on
the normal and empirical distributions are quite similar. There
is less similarity between the estimates based on the normal and
uniform distributions and between the estimates based on the
empirical and uniform distributions. Nevertheless the agreement
is still quite substantial.
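
The summary statistics used throughout the comparisons (a mean and standard deviation for each set of estimates, and a correlation between each pair of sets) can be computed as sketched below. The text does not say whether the standard deviations use n or n - 1 in the denominator; this sketch assumes n - 1. The example values are hypothetical and are not taken from the tables.

```python
from math import sqrt


def mean_sd(values):
    """Mean and standard deviation (n - 1 denominator, an assumption)."""
    n = len(values)
    m = sum(values) / n
    var = sum((v - m) ** 2 for v in values) / (n - 1)
    return m, sqrt(var)


def correlation(x, y):
    """Pearson correlation between two sets of item parameter estimates."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical difficulty estimates for five items under two conditions.
b_normal = [-2.1, -1.4, -0.9, -0.2, 0.5]
b_uniform = [-2.4, -1.5, -1.0, -0.1, 0.4]
print(mean_sd(b_normal), mean_sd(b_uniform))
print(correlation(b_normal, b_uniform))
```

A high correlation shows only that two sets of estimates order the items similarly, so the means and standard deviations are reported alongside it to reveal shifts in location or spread.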



Insert Table 2 About Here 



The effect of the ability distribution on item-parameter
estimates for the three parameter model was also examined in
connection with three other estimation schemes: MML-EAD, BE, and
BE-UIP. The results for the MML-EAD procedure were very similar
to those reported in Table 2. The results for the BE and BE-UIP
were also quite similar to one another. Results based on the BE
procedure are reported in Table 3. Comparing the results in
Table 3 to those in Table 2 indicates the Bayesian procedures
were even less affected than the marginal maximum likelihood
procedures were by the choice of the ability distribution.



Insert Table 3 About Here 



The effect of ability distribution on item-parameter
estimates was also investigated in connection with the one and
two parameter models, and the three parameter common c model.
With these simpler models, the effect of the ability distribution
was as small as or smaller than it was for the three parameter
model. This trend is illustrated by the results, reported in
Table 4, for the MML procedure applied with the three parameter
common c model.



Insert Table 4 About Here 



The preceding results are based on estimates obtained by
scoring omits as wrong answers. With the three parameter model
(with or without a common c) omits can also be scored as
fractionally correct. With this latter option, the effects of
ability distributions were small and approximately the same as
with the former option.

For all of the preceding results, the sample size was 250.
It seemed unlikely that the effect of the prior ability
distribution would increase as the sample size increased.
However, to check this possibility, a sample size of 1000 and a
normal, an empirical, and a uniform distribution were employed
with each of the four estimation procedures applied to the three
parameter model and the three parameter common c model. Omits
were scored as incorrect. For the MML procedure, means, standard
deviations, and correlations are reported in Table 5. Comparison
of the results in Tables 2 and 5 indicates that the effect of
ability distribution is independent of the sample size. The
effect of sample size was similar for the other estimation
procedures and for the three parameter common c model.



Insert Table 5 About Here



Type of Estimation Procedure 

Four different estimation procedures (MML, MML-EAD, BE, and
BE-UIP) were employed to obtain four sets of estimates for the
three parameter model. Means and standard deviations for each
set of estimates, based on a sample of 250 and a normal prior,
are reported in Table 6. Also reported are correlations between
the a's, between the b's, and between the c's obtained by
using the four methods. The results indicate that the two MML
procedures yield similar estimates, as do the two Bayesian
procedures. However, between the two types of procedures
(marginal maximum likelihood and Bayesian), the estimates are
less similar.



Insert Table 6 About Here 



The effect of estimation procedure was also investigated
with three simpler models: the one and two parameter models, and
the three parameter common c model. Estimation procedure had
almost no effect with the simpler models. This is illustrated by
results for the three parameter common c model reported in Table
7. The estimates described by these results were calculated
using a normal ability distribution. With the empirical and
uniform ability distributions the results for the three parameter
common c model were also unaffected by method of estimation.
Similarly, with each ability distribution the results for the one
and two parameter models also indicated a lack of effect for
estimation procedure.



Insert Table 7 About Here 



The preceding results are based on a sample size of 250.
The effect of method of estimation was also investigated with a
normal ability distribution and samples of 500, 750, and 1000
examinees. For a sample size of 250 and with the three simpler
models, the estimation-method effect was quite small. With larger
sample sizes it appeared to become even smaller. With the three
parameter model the effect of sample size depended on the
parameter. For the a parameter, increasing the sample size from
250 to 500 increased the between estimation-method correlations
and decreased the between method differences in means and
standard deviations. Further increases in sample size appeared
not to affect the similarity of the estimates. These trends are
shown in Table 8. For the b parameter, increasing the sample
size had a negligible effect on the between estimation-method
correlations. The between method differences in means tended to
decrease as the sample size increased from 250 to 500 and to remain
about the same with further increases. The effect of increased
sample size on between method differences in standard deviations
was irregular. The between method differences in standard
deviations increased as the sample size changed to 500, then
decreased as the sample size increased to 750, and decreased again
as the sample size increased to 1000 examinees. This trend
principally reflects the behavior of the largest of the estimated
b's. With the MML procedure, for example, the largest (in absolute
value) of the estimated b's were approximately -4, -12, -8, and -5
with 250, 500, 750, and 1000 examinees respectively. Thus the trend
found for standard deviations may not occur with other tests. For the
c parameter, there were negligible effects of increasing sample
size on between method differences in parameter estimates.



Insert Table 8 About Here 



As expected, the maximum likelihood procedures tended to 
result in more extreme estimates than the Bayesian procedures. 
This tendency was most marked with the three parameter model and is
illustrated in Table 9. The results in Table 9 describe the 
estimates obtained using a normal ability distribution. The 
tendency for the maximum likelihood procedures to produce extreme 
estimates was not reduced by using the empirical or the uniform 
ability distribution. With the simpler models the maximum 
likelihood procedures had less of a tendency to produce extreme 
estimates. When extreme estimates were produced the 
discrepancies between the maximum likelihood and Bayesian 
estimates tended to be smaller than they were with the three 
parameter model. 



Insert Table 9 About Here



Scoring of Omits

For the three parameter model and the three parameter common 
c model, parameters were estimated with omits scored as wrong and 
omits scored as fractionally correct. The effect of the method
of scoring was relatively minor for the sample of 250 and 
decreased with increasing sample size. This trend is illustrated 
in Table 10 which contains results for the three parameter model 
and MML parameter estimation. 



Insert Table 10 About Here 



Ability Estimates 

Correlations among the ML, EAP, and MAP ability estimates,
based on MML and BE item parameter estimates obtained using a
sample of 250 examinees and a normal ability distribution, are
reported in Table 11. The sample size for the correlations was
also 250 and the sample was the same as the one used to obtain
the item parameter estimates. The correlations are all above
.90. Similar results were obtained for ability estimates based
on MML and BE item parameter estimates obtained using a uniform
ability distribution. The cross correlations between the two
sets of ability estimates were also all above .90.



Insert Table 11 About Here 



Means and standard deviations for the ability estimates,
calculated using item parameter estimates based on a normal
ability distribution, are reported in Table 12. The item
estimation procedures (MML and BE) had a relatively small
effect on the mean ability estimate. Controlling for ability
estimation procedure and robustification, the mean differences
range in absolute value from .03 to .06. Similarly, the effect
on standard deviations was small. The effect of robustification
was relatively small; it increased the mean ability estimate,
with increases between .03 and .06. The effect on standard
deviations was also small. The effect of ability estimation
procedure (ML, EAP, and MAP) on means and standard deviations
was relatively large. Moreover the effect appears to be larger
with robustification than without. In general, the ML estimates
had the largest mean and standard deviation. The MAP estimates
had the smallest mean and standard deviation. The effect was due
to the fact that the ML procedure produced much higher maximum
ability estimates than either the EAP or MAP. Minimum and
maximum ability estimates are shown in Table 13. Both the
differences in the mean estimates and in the maximum estimates
are of sufficient size to be of practical significance. This is
particularly true for the differences between the ML estimates
and either the MAP or the EAP estimates.



Insert Tables 12 and 13 About Here 



As noted earlier, the ability estimates were obtained using
item parameter estimates that were calculated using a sample size
of 250. Because the item parameters are treated as known in the
ability estimation phase, increasing the sample size in the item
parameter estimation phase should not have any impact on the
effect ability estimation procedure has on the ability estimates.

The general trend in the results was the same for ability
estimates based on item parameter estimates calculated using the
uniform distribution. The effect of robustification was of about
the same magnitude as in the preceding results. The effect of
item estimation procedure (MML or BE) on mean ability estimates
was, however, larger. It ranged in absolute value from .09 to
.13. The effect of ability estimation procedure (ML, EAP, or
MAP) was about the same magnitude as in the preceding results.



Summary 

The results indicate that, for the most part, the various
item parameter estimation procedures tend to yield similar
results. The major exception to this generalization concerned
the Bayesian and maximum likelihood procedures applied to the
three parameter model. With 250 examinees, correlations between
a estimates averaged about .75 for maximum likelihood-Bayesian
pairs of estimation procedures. For c estimates the correlation
was likewise about .75. For the b estimates the correlations
averaged about .92. For the a estimates, these correlations
increased to between .90 and .95 with sample sizes of 500, 750,
and 1000 examinees. The correlations for the b and c estimates
were largely unaffected by changes in the sample sizes.

The maximum likelihood procedure had a tendency to produce
more extreme estimates than the Bayesian procedure. This
tendency was most pronounced when the procedures were applied to
the three parameter model.

Ability estimation was investigated only in connection with
the three parameter model. Generally, the correlations were high
between ability parameter estimates obtained using the various
approaches studied in this research. In addition, the results
indicated that robustification did not strongly affect the mean
or standard deviation of the ability estimates. The results also
indicated that the mean and variance of the ability estimates
were not strongly affected by the type of item parameter
estimates used in calculating the ability estimates, at least
when the item parameter estimates were based on a normal ability
distribution. The effect of type of item parameter estimate was
stronger when the item parameters were calculated using a uniform
ability distribution. The importance of the latter finding is
that it emphasizes the possibility that the former results may be
sample and/or test specific. In studying the effect of item
estimation procedure, the sample for which ability estimates were
obtained was also used for item parameter estimation. It is
possible that the method of item parameter estimation might have
a stronger effect when ability estimates are calculated for a new
sample. This should be investigated.

There was a fairly strong effect of ability estimation
procedure on the ability estimates. The largest discrepancies
were between the ML procedure, on one hand, and the EAP and MAP
procedures on the other. Additional research should be
undertaken to determine whether these differences will also occur
with the simpler models. In addition, analyses of simulated data
should be undertaken to determine whether any of the three
procedures produces substantially biased ability estimates.



Footnote 

This research was supported by a grant from the Institute 
for Student Assessment and Evaluation, University of Florida. 

Table 1

Frequency Distributions for Raw Scores, Classical Item Difficulties, and
Item-Total Biserials

Score                    Item Difficulty                Biserial
Interval    Frequency    Interval         Frequency     Interval     Frequency

11-15            2       <.65                 1         .200-.299        5
16-18           14       .650-.699            3         .300-.399        5
19-22           19       .700-.749            2         .400-.499        8
23-24           37       .750-.799            4         .500-.599        8
25-26           38       .800-.849            9         .600-.699        9
27-28           43       .850-.899            5         .700-.800        4
29              54       .900-.949            7
30              45       .950-.999            8
31              51
32              56
33              69
34          [illegible]
35              84
36             115
37             120
38             107
39              60

Note: N=1000 examinees, n=39 items

Table 2

Descriptive Statistics for MML Estimates Based on Three
Ability Distributions: Three Parameter Model

            Ability                                           Standard
Parameter   Distribution      N        E        U      Mean   Deviation

a           Normal          1.00      .99      .92     1.83      .98
            Empirical                1.00      .89     1.84     1.07
            Uniform                           1.00     1.67     1.10

b           Normal          1.00      .98      .95    -1.25     1.05
            Empirical                1.00      .91    -1.[??]   1.21
            Uniform                           1.00    -1.36     1.10

c           Normal          1.00      .9[?]    .70      .24      .21
            Empirical                1.00      .67      .24      .21
            Uniform                           1.00      .18      .20

Note: N=250 examinees

Table 3

Descriptive Statistics for BE Estimates Based on Three
Ability Distributions: Three Parameter Model

            Ability                                           Standard
Parameter   Distribution      N        E        U      Mean   Deviation

a           Normal          1.00      .99      .98     1.44      .51
            Empirical                1.00      .96     1.41      .52
            Uniform                           1.00     1.44      .51

b           Normal          1.00      .99      .99    -1.45   [illegible]
            Empirical                1.00      .99    -1.53     1.27
            Uniform                           1.00    -1.32     1.11

c           Normal          1.00      .98      .83      .25      .03
            Empirical                1.00      .67      .26      .04
            Uniform                           1.00      .25      .02

Note: N=250 examinees
Table 4

Descriptive Statistics for MML Estimates Based on Three Ability
Distributions: Three Parameter Common c Model

            Ability                                        Standard
Parameter   Distribution      N       E       U     Mean   Deviation

a           Normal          1.00     .99     .97    1.51      .55
            Empirical               1.00     .96    1.49      .55
            Uniform                         1.00    1.49      .57

b           Normal          1.00     .99     .99   -1.51     1.16
            Empirical               1.00     .99   -1.58     1.23
            Uniform                         1.00   -1.47     1.17

c           Normal                                   .21
            Empirical                                .21
            Uniform                                  .19

Note: N=250 examinees
Table 5

Descriptive Statistics for MML Estimates Based on Three Ability
Distributions: Three Parameter Model

            Ability                                               Standard
Parameter   Distribution      N         E          U       Mean   Deviation

a           Normal          1.00       .99        .88      1.48      .57
            Empirical                 1.00        .83      1.48      .58
            Uniform                              1.00      1.52      .67

b           Normal          1.00   [illegible]    .96     -1.63     1.39
            Empirical                 1.00        .97     -1.64     1.39
            Uniform                              1.00     -1.74     1.28

c           Normal          1.00       .94        .73       .26      .21
            Empirical                 1.00        .72       .28      .20
            Uniform                              1.00       .17      .20

Note: N=1000 examinees
Table 6

Descriptive Statistics for MML, MML-EAD, BE, and BE-UIP Estimates:
Three Parameter Model

            Estimation                                             Standard
Parameter   Procedure     MML   MML-EAD     BE   BE-UIP     Mean   Deviation

a           MML          1.00     .93      .79     .80      1.83      .91
            MML-EAD              1.00      .71     .71      1.78      .87
            BE                            1.00     .99      1.53      .53
            BE-UIP                                1.00      1.52      .56

b           MML          1.00     .97      .93     .92     -1.25     1.05
            MML-EAD              1.00      .93     .92     -1.31     1.19
            BE                            1.00    1.00     -1.45     1.18
            BE-UIP                                1.00     -1.42     1.08

c           MML          1.00     .93      .73     .74       .24      .21
            MML-EAD              1.00      .73     .74       .24      .21
            BE                            1.00     .99       .25      .03
            BE-UIP                                1.00       .25      .03

Note: N=250 examinees, normal ability distribution
Table 7

Descriptive Statistics for MML, MML-EAD, BE, and BE-UIP Estimates:
Three Parameter Common c Model

            Estimation                                             Standard
Parameter   Procedure     MML   MML-EAD     BE   BE-UIP     Mean   Deviation

a           MML          1.00     .99      .99     .99      1.52      .56
            MML-EAD              1.00     1.00     .99      1.51      .55
            BE                            1.00     .99      1.42      .51
            BE-UIP                                1.00      1.49      .52

b           MML          1.00     .99      .99     .99     -1.51     1.19
            MML-EAD              1.00      .99     .99     -1.54     1.18
            BE                            1.00     .99     -1.49     1.05
            BE-UIP                                1.00     -1.47     1.09

c           MML                                              .21
            MML-EAD                                          .22
            BE                                               .23
            BE-UIP                                           .21

Note: N=250 examinees, normal ability distribution
Table 8

Descriptive Statistics for Estimates Obtained Using Various
Sample-Size-Estimation-Procedure Combinations: Three Parameter Model

Sample   Estimation                                                            Standard
Size     Method         MML      MML-EAD        BE          BE-UIP      Mean   Deviation

250      MML           1.00    [illegible]      .79           .80       1.83      .91
         MML-EAD                  1.00          .71           .71       1.78      .87
         BE                                    1.00           .99       1.44      .51
         BE-UIP                                              1.00       1.53      .53

500      MML           1.00       .99           .95           .96       1.35      .59
         MML-EAD                  1.00          .93           .93       1.34      .61
         BE                                    1.00           .99       1.25      .51
         BE-UIP                                              1.00       1.26      .52

750      MML           1.00    [illegible]  [illegible]   [illegible]   1.33      .52
         MML-EAD                  1.00          .91           .91       1.51      .52
         BE                                    1.00           .99       1.41      .48
         BE-UIP                                              1.00       1.41      .47

1000     MML           1.00       .99           .94           .95       1.47      .57
         MML-EAD                  1.00          .93           .93       1.45      .58
         BE                                    1.00           .99       1.34      .48
         BE-UIP                                              1.00       1.36      .49

Note: Normal ability distribution
Table 9

Minimum and Maximum Item Parameter Estimates

                                            Parameter

                              a                b                c
Sample   Estimation
Size     Procedure       Min     Max      Min     Max      Min     Max

250      MML              .5     5.3     -4.2      .2      .00     .50
         BE               .5     2.7     -4.3      .7      .18     .34

500      MML              .2     2.8    -12.1      .6      .00     .50
         BE               .3     2.3     -7.1      .7      .18     .35

750      MML              .4     2.5     -8.3      .9      .00     .50
         BE               .4     2.3     -6.7      .9      .15     .44

1000     MML              .5     3.1     -5.5      .9      .00     .50
         BE               .5     2.2     -5.6      .9      .15     .42

Note: Normal ability distribution
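
One way to read Table 9 is to compare the width of the interval spanned by the estimates under each procedure. The sketch below is illustrative only: the values are transcribed from the table, the parameter labels a, b, and c are assumed from the column order (the printed labels are not legible), and `table9` and `estimate_range` are hypothetical names.

```python
# Minimum and maximum estimates transcribed from Table 9.
# Keys: (sample size, procedure); values: {parameter: (min, max)}.
# The labels "a", "b", "c" are assumed from the column order.
table9 = {
    (250, "MML"): {"a": (0.5, 5.3), "b": (-4.2, 0.2), "c": (0.00, 0.50)},
    (250, "BE"): {"a": (0.5, 2.7), "b": (-4.3, 0.7), "c": (0.18, 0.34)},
    (500, "MML"): {"a": (0.2, 2.8), "b": (-12.1, 0.6), "c": (0.00, 0.50)},
    (500, "BE"): {"a": (0.3, 2.3), "b": (-7.1, 0.7), "c": (0.18, 0.35)},
    (750, "MML"): {"a": (0.4, 2.5), "b": (-8.3, 0.9), "c": (0.00, 0.50)},
    (750, "BE"): {"a": (0.4, 2.3), "b": (-6.7, 0.9), "c": (0.15, 0.44)},
    (1000, "MML"): {"a": (0.5, 3.1), "b": (-5.5, 0.9), "c": (0.00, 0.50)},
    (1000, "BE"): {"a": (0.5, 2.2), "b": (-5.6, 0.9), "c": (0.15, 0.42)},
}


def estimate_range(bounds):
    """Width of the interval between the smallest and largest estimate."""
    low, high = bounds
    return high - low


for n in (250, 500, 750, 1000):
    for param in ("a", "b", "c"):
        mml = estimate_range(table9[(n, "MML")][param])
        be = estimate_range(table9[(n, "BE")][param])
        print(f"N={n:4d} {param}: MML range {mml:5.2f}, BE range {be:5.2f}")
```

On these values the MML range is wider than the BE range for a and c at every sample size. For b the comparison is mixed: the BE range is slightly wider at N=250 and N=1000.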
Table 10

Descriptive Statistics for MML Estimates Based on Incorrect and
Fractionally Correct Scoring of Omits: Three Parameter Model

            Sample   Scoring of                                     Standard
Parameter   Size     Omits          W      FC        Mean           Deviation

a           250      W            1.00     .97       1.83              .91
                     FC                   1.00    [illegible]      [illegible]
            500      W            1.00     .99       1.35              .59
                     FC                   1.00    [illegible]      [illegible]
            750      W            1.00     .99       1.56              .52
                     FC                   1.00    [illegible]      [illegible]
            1000     W            1.00     .99       1.47              .57
                     FC                   1.00       1.47          [illegible]

b           250      W            1.00     .96      -1.25             1.05
                     FC                   1.00    [illegible]      [illegible]
            500      W            1.00     .99      -2.18             2.39
                     FC                   1.00    [illegible]      [illegible]
            750      W            1.00    1.00      -1.56             1.42
                     FC                    .99      -1.57             1.48
            1000     W            1.00    1.00      -1.63             1.39
                     FC                    .99      -1.63             1.37

c           250      W            1.00     .84        .24              .21
                     FC                   1.00        .24              .21
            500      W            1.00     .99        .19              .19
                     FC                   1.00        .19              .20
            750      W            1.00     .99        .18              .19
                     FC                   1.00        .18              .20
            1000     W            1.00     .99        .26              .21
                     FC                   1.00        .26              .21

Note: Normal ability distribution






Table 11

Ability Estimate Intercorrelations

                 MML Item Parameter Estimation       BE Item Parameter Estimation
                    ML         EAP         MAP         ML         EAP         MAP
                 NBW    BW   NBW    BW   NBW    BW   NBW    BW   NBW    BW   NBW    BW

MML  ML   NBW   1.00   .94   .98   .95   .98   .94   .99   .92   .97   .94   .97   .94
          BW          1.00   .94   .96   .94   .97   .95   .99   .94   .95   .95   .96
     EAP  NBW               1.00   .98   .99   .97   .99   .93   .99   .98   .99   .97
          BW                      1.00   .98   .99   .97   .96   .98   .99   .98   .99
     MAP  NBW                           1.00   .98   .99   .94   .99   .98   .99   .97
          BW                                  1.00   .97   .97   .98   .99   .98   .99
BE   ML   NBW                                       1.00   .95   .99   .97   .93   .97
          BW                                              1.00   .95   .97   .95   .97
     EAP  NBW                                                   1.00   .99   .99   .98
          BW                                                          1.00   .99   .99
     MAP  NBW                                                               1.00   .99
          BW                                                                      1.00

Note: Ability estimates based on item parameter estimates obtained
using a sample size of 250 and a normal ability distribution. The
sample size for the correlations is also 250.
Table 12

Means and Standard Deviations for Ability Estimates

Estimation Procedure            NBW                     BW

                                    Standard                Standard
Item      Ability        Mean       Deviation    Mean       Deviation

MML       ML              .05         1.13        .08         1.15
          EAP            -.02          .91       -.07          .85
          MAP            -.07          .86       -.09          .79

BE        ML              .08         1.18        .11         1.20
          EAP             .03          .96       -.03          .89
          MAP            -.01          .91       -.06          .84

Note: Ability estimates based on item parameter estimates
obtained using a normal ability distribution.
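
The standard deviations in Table 12 can be expressed relative to the ML value to show how far the EAP and MAP estimates are pulled toward the mean. This is a minimal Python sketch under the assumption that the tabled values are transcribed correctly; `sd` and `sd_ratio` are illustrative names, not part of BILOG or of the original analysis.

```python
# Standard deviations transcribed from Table 12.
# Keys: (item procedure, ability procedure); values: (NBW SD, BW SD).
sd = {
    ("MML", "ML"): (1.13, 1.15),
    ("MML", "EAP"): (0.91, 0.85),
    ("MML", "MAP"): (0.86, 0.79),
    ("BE", "ML"): (1.18, 1.20),
    ("BE", "EAP"): (0.96, 0.89),
    ("BE", "MAP"): (0.91, 0.84),
}


def sd_ratio(item_proc, ability_proc, column):
    """SD of one ability procedure divided by the ML SD for the same
    item parameter estimates (column 0 = NBW, column 1 = BW)."""
    return sd[(item_proc, ability_proc)][column] / sd[(item_proc, "ML")][column]


for item_proc in ("MML", "BE"):
    for ability_proc in ("EAP", "MAP"):
        for column, label in enumerate(("NBW", "BW")):
            ratio = sd_ratio(item_proc, ability_proc, column)
            print(f"{item_proc} items, {ability_proc}, {label}: {ratio:.2f}")
```

Every ratio is below 1, so on these values the EAP and MAP estimates are less dispersed than the ML estimates, and the MAP estimates are the least dispersed of the three.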
Table 13

Minimum and Maximum Ability Estimates

Estimation Procedure              NBW                    BW

Item      Ability        Minimum    Maximum     Minimum    Maximum

MML       ML              -4.00       2.48       -4.00       3.10
          EAP             -3.99       1.50       -3.94       1.17
          MAP             -4.00       1.35       -3.36       1.03

BE        ML              -4.00       2.45       -4.00       3.32
          EAP             -3.56       1.58       -3.41       1.26
          MAP             -3.55       1.45       -3.41       1.12

Note: Ability estimates based on item parameter estimates
obtained using a normal ability distribution.